 deep speech


How to Build Your Own End-to-End Speech Recognition Model in PyTorch

#artificialintelligence

Deep learning has changed the game in speech recognition with the introduction of end-to-end models. These models take in audio and directly output transcriptions. Two of the most popular end-to-end models today are Deep Speech by Baidu and Listen Attend Spell (LAS) by Google. Both Deep Speech and LAS are recurrent neural network (RNN) based architectures with different approaches to modeling speech recognition.


Mozilla and BMZ Announce Cooperation to Open Up Voice Technology for African Languages - The Mozilla Blog

#artificialintelligence

Today, Mozilla and the German Ministry for Economic Cooperation and Development (BMZ) announced that they will join forces to collect open speech data in local languages and to develop local innovation ecosystems for voice-enabled products and technologies. The initiative builds on a pilot project that our Open Innovation team and the Machine Learning Group started together with the organization "Digital Umuganda" earlier this year. The Rwandan start-up collects language data in Kinyarwanda, an African language spoken by over 12 million people. Further languages in Africa and Asia are going to be added. Mozilla's projects Common Voice and Deep Speech will be at the heart of the joint initiative, which aims to collect diverse voice data and open up a common, public database.


Baidu's AI Lab Director on Advancing Speech Recognition and Simulation

#artificialintelligence

Adam Coates is the Director of Baidu's Silicon Valley AI Lab. Craig Cannon [00:00] - Hey, this is Craig Cannon, and you're listening to Y Combinator's podcast. This episode is with Adam Coates. Adam's the director of Baidu's Silicon Valley AI Lab, and what they focus on is developing AI technologies that'll impact at least 100 million people. We spent a good chunk of this episode talking about Adam's work in speech to text and text to speech, so if you want to learn more about those projects, you can check out research.baidu.com. Today we have Adam Coates here for an interview. Adam, you run the AI lab at Baidu, in Silicon Valley. Could you just give us a quick intro and explain what Baidu is for people who don't know? Adam Coates [00:40] - Yeah, Baidu is actually the largest search engine in China. So it turns out the internet ecosystem in China is this incredibly dynamic environment. So Baidu, I think, turned out to be an early technology leader and really established itself in PC search, but then also has remade itself in the mobile revolution, and increasingly, today, is becoming an AI company, recognizing the value of AI for a whole bunch of different applications, not just search.


The Mobile Internet Is Over. Baidu Goes All In on AI

#artificialintelligence

This marathon carried on for 15 hours a day for an entire month. Clients that supplied the material received professional-grade Chinese versions of the originals at a bargain price. But Baidu Inc., the Beijing-based company that organized the mass translation, got something potentially more valuable: millions of English-Mandarin word pairs with which to train its online translation engine. China is infamous for its knockoffs, whether luxury handbags or web startups. But the country's leadership seems to understand that when it comes to artificial intelligence, cheap imitations just won't do--not when its rivals include Alphabet, Facebook, IBM, and Microsoft.


The Mobile Internet Is Over. Baidu Goes All In on AI

@machinelearnbot

On Dec. 6, 2016, thousands of translators filed into office buildings across mainland China to pore over brochures, letters, and technical manuals, all in foreign languages, painstakingly rendering their texts in Chinese characters. This marathon carried on for 15 hours a day for an entire month. Clients that supplied the material received professional-grade Chinese versions of the originals at a bargain price. But Baidu Inc., the Beijing-based company that organized the mass translation, got something potentially more valuable: millions of English-Mandarin word pairs with which to train its online translation engine. China is infamous for its knockoffs, whether luxury handbags or web startups.


Baidu to Adopt Intel's New Chip for Artificial Intelligence - Life of Guangzhou

#artificialintelligence

China's biggest search engine, Baidu, announced that it will use Intel's Xeon Phi processor as the processor's release plan was disclosed on August 17 at Intel's annual developer forum in San Francisco. "When it comes to AI (artificial intelligence), Intel's Xeon Phi is a great fit," said Jing Wang, a senior vice president of Baidu, who joined Diane Bryant, executive vice president in charge of Intel's data center group, at the forum. Intel said Xeon Phi will help accelerate deep learning, a computerized technique increasingly used for tasks such as interpreting speech, identifying objects in photos and piloting autonomous vehicles. Baidu, having researched the application of artificial intelligence for years, is considering using the new chip to support its voice recognition system, called Deep Speech. Deep Speech is based on a collection of 7,000 hours of voice clips created by 9,600 people.


Is China Ready to Ditch Typing?

#artificialintelligence

Google may have DeepMind, but Baidu, China's homegrown Google, has Deep Speech. Deep Speech, which debuted in December 2015, is a speech recognition system that uses an artificial neural network to translate audio input directly to transcribed output. By contrast, most speech recognition systems, including Siri, use multiple, engineer-crafted steps to make translations. The system has learned how to recognize and transcribe both English and Mandarin, and according to a Baidu paper released in February 2016, it has a recognition rate that is more accurate than most native Mandarin speakers. Baidu announced earlier in April that it will begin rolling out the deep speech technology in collaboration with Peel, a smart remote app that will be available in both English and Mandarin for Android, followed by iOS.


Baidu's Silicon Valley AI Lab Announces Collaboration with Peel at GPU Tech Conference - Baidu Research

#artificialintelligence

Example of Peel's universal remote app interface, with voice functionality enabled by Baidu Research's Deep Speech. Baidu Research and Peel announced that Baidu's Deep Speech technology is being integrated into Peel's smart home platform to create next-generation voice-enabled products. Peel offers a popular universal remote app for smartphones and tablets. It has more than 150 million users in 200 countries and 10 billion monthly remote commands. Deep Speech is a state-of-the-art speech recognition system developed using "end-to-end deep learning" by Baidu Research's Silicon Valley AI Lab (SVAIL).